(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 

International Bureau 

(43) International Publication Date 
2 August 2001 (02,08.2001) 




PCT 



(10) International Publication Number 

WO 01/55967 A2 



(51) International Patent Classification 7 : G06T 7/40 

(21) International Application Number: PCT/US0 1/02593 

(22) International Filing Date: 26 January 2001 (26.01.2001) 

(25) Filing Language: English 

(26) Publication Language: English 



(30) Priority Data: 

60/178,474 
09/770,833 



27 January 2000 (27,01 .2000) US 
25 January 2001 (25.01.2001) US 



(63) Related by continuation (CON) or continuation-in-part 
(CIP) to earlier applications: 

US Not furnished (CON) 

Filed on 25 January 2001 (25.01.2001) 

US 60/178,474 (CON) 

Filed on 27 January 2000 (27.0 1 .2000) 

(71) Applicant (for all designated States except US): AP- 
PLIED PRECISION, INC. [US/US]; 1040 12th Avenue 
N.W., Issaquah, WA 98027-8929 (US). 



(72) Inventors; and 

(75) Inventors/Applicants (for US only): BROWN, Carl, 

S. [US/US]; 5029 37th Avenue N.E., Seattle, WA 98105 
(US). GOODWIN, Paul, C. [US/US]; 113 N. 203rd 
Street, Shoreline, WA 98133 (US). 

(74) Agent: HAGLER, James, T.; Fish & Richardson RC, 
4350 La Jolla Village Drive, Suite 500, San Diego, CA 
92122 (US). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU, 
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CR, CU, CZ, 
DE, DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR, 
HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK. LR, 
LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, 
NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM. 
TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW. 

(84) Designated States (regional): ARIPO patent (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian 
patent (AM, AZ, BY, KG, KZ, MD, RU. TJ, TM). European 
patent (AT, BE, CH, CY, DE, DK. ES, FI, FR, GB, GR. IE, 
IT, LU, MC, NL, PT, SE, TR), OAPI patent (BF, BJ, CF, 
CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG). 

[Continued on next page] 



=j (54) Title: IMAGE METRICS IN THE STATISTICAL ANALYSIS OF DNA MICROARRAY DATA 



Gene Expression Curve 



< 
ON 

in 



o 




1000 



1 Control Signal (Ftuoresctnc* Counts. Denominator. CY3, "Gr— n") 



j 

10000 I 



(57) Abstract: Expression profiling using DNA microarrays is an important new method for analyzing cellular physiology. In 
"spotted" microarrays, fluorescently labeled cDNA from experimental and control cells is hybridized to arrayed target DNA and the 
arrays imaged at two or more wavelengths. Statistical is performed on microarray images and show that non-additive background, 
high intensity fluctuations across spots, and fabrication artifacts interfere with the accurate determination of intensity information. 
The probability density distributions generated by pixel-by-pixel analysis of images can be used to measure the precision with which 
spot intensities are determined. Simple weighting schemes based on these probability distributions are effective in improving sig- 
nificantly the quality of microarray data as it accumulates in a mulu'experiment database. Error estimates from image-based metrics 
should be one component in an explicity probabilistic scheme for the analysis of DNA microarray data. 
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IMAGE METRICS IN THE STATISTICAL 
ANALYSIS OF DNA MICRO ARRAY DATA 

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims benefit of U.S. Provisional Application No. 60/178,474, 
filed January 27, 2000. 

TECHNICAL FIELD 

This invention relates to DNA microarray analysis, and more particularly to using 
error estimates from image based metrics to analyze microarrays. 

BACKGROUND 

Large-scale expression profiling has emerged as a leading technology in the 
systematic analysis of cell physiology. Expression profiling involves the hybridization of 
fluorescently labeled cDNA, prepared from cellular mRNA, to microarrays carrying up to 
10 5 unique sequences. Several types of microarrays have been developed, but 
microarrays printed using pin transfer are among the most popular. Typically, a set of 
target DNA samples representing different genes are prepared by PCR and transferred to 
a coated slide to form a 2-D array of spots with a center-to-center distance (pitch) of 
about 200 um. In the budding yeast S. cerevisiae, for example, an array carrying about 
6200 genes provides a pan-genomic profile in an area of 3 cm 2 or less. mRNA samples 
from experimental and control cells are copied into cDNA and labeled using different 
color fluors (the control is typically called green and the experiment red). Pools of 
labeled cDNAs are hybridized simultaneously to the microarray, and relative levels of 
mRNA for each gene determined by comparing red and green signal intensities. An 
elegant feature of this procedure is its ability to measure relative mRNA levels for many 
genes at once using relatively simple technology. 

Computation is required to extract meaningful information from the large 
amounts of data generated by expression profiling. The development of bioinformatics 
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tools and their application to the analysis of cellular pathways are topics of great interest. 
Several databases of transcriptional profiles are accessible on-line and proposals are 
pending for the development of large public repositories. However, relatively little 
attention has been paid to the computation required to obtain accurate intensity 
information from microarrays. The issue is important however, because microarray 
signals are weak and biologically interesting results are usually obtained through the 
analysis of outliers. Pixel-by-pixel information present in microarray images can be used 
in the formulation of metrics that assess the accuracy with which an array has been 
sampled. Because measurement errors can be high in microarrays, a statistical analysis 
of errors combined with well-established filtering algorithms are needed to improve the 
reliability of databases containing information from multiple expression experiments. 

DESCRIPTION OF DRAWINGS 

These and other features and advantages of the invention will become more apparent 
upon reading the following detailed description and upon reference to the accompanying 
drawings. 

Figure 1 is a gene expression curve according to one embodiment of the present 
invention. 

Figure 2 is a graph illustrating the distribution of ratios calculated for each pixel 
of several spots according to one embodiment of the present invention. 

Figure 3 is a graph illustrating the sport intensity standard deviation plotted against 
the average signal. 

Figure 4 is a graph plotting covariance as a function of variance according to one 
embodiment of the invention. 

DETAILED DESCRIPTION 

The present invention involves the process of extracting quantitatively accurate 
ratios from pairs of images and the process of determining the confidence at which the 
ratios were properly obtained. One common application of this methodology is analyzing 
cDNA Expression Arrays (microarrays). In these experiments, different cDNA's are 
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arrayed onto a substrate and that array of probes is used to test biological samples for the 

presence of specific species of mRNA messages through hybridization. In the most 

common implementation, both an experimental sample and a control sample are 

hybridized simultaneously onto the same probe array. In this way, the biochemical 

process is controlled for throughout the experiment. The ratio of the experimental 

hybridization to the control hybridization becomes a strong predictor of induction or 

repression of gene expression within the biological sample. 

The gene expression ratio model is essentially, 

Experiment _ Expression 

Expression Ratio = : — 

~ Wild —type _ Expression 

Equation 1 

In typical fluorescent microarray experiments, levels of expression are measured 
from the fluorescence intensity of fluorescently labeled experiment and wild-type DNA. 
A number of assumptions are made about the fluorescence intensity, including: 1) the 
amount of DNA bound to a given spot is proportional to the expression level of the given 
gene; 2) the fluorescence intensity is proportional to the concentration of fluorescent 
molecules; and 3) the detection system responds linearly to fluorescence. 

By convention, the fluorescent intensity of the experiment is called "red" and the 
wild-type is called "green". The simplest form of the gene expression ratio is 

R = ^- 
g 

Equation 2 

where r and g represent the number of experiment and control DNA molecules 
that bind to the spot. R is the expression ratio of the experiment and control. 

In real situations, however, r and g are unavailable. The measured values, r m and 
g m , include an unknown amount of background intensity that consists of background 
fluorescence, excitation leak, and detector bias. That is, 
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gm=S + Sb 

Equation 4 

where r b and g b are unknown amounts of background intensity in the red and 
green channels, respectively. 

Including background values, the gene expression ratio becomes: 

gm gb 

Equation 5 

Equation 5 shows that solving the correct ratio R requires knowledge of the 
background intensity for each channel. The importance of the determining the correct 
background values is especially significant when r m and g m are only slightly above than r b 
and g b . For example, the graph 100 in Figure 1 shows the effect of a ten count error in 
determining r b or g b , in the case where R is known to be one. The expected result is 
shown as line 105. A ten count error in the denominator is shown as line 1 10. A ten 
count error in the numerator is shown as 115. 

In experimental situations, the sensitivity of the gene expression ratio technique 
can be limited by background subtraction errors, rather than the sensitivity of the 
detection system. Accurate determination of r b and g b is thus a key part of measuring the 
ratio of weakly expressed genes. 

Rearranging equation 5, gives 

r„=R(g m -g b ) + r h =Rg m +* 

Equation 6 

where 

k = r h -Rg b . 

Equation 7 

Least squares curve-fit of equation 6 can be used to obtain the best-fit values of R 
and Jfc, assuming that r b and g b are constant for all spot intensities involved with the curve- 
fit. The validity of this assumption depends upon the chemistry of the microarray. Other 
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background intensity subtraction techniques, however, can have more severe limitations. 
For example, the local background intensity is often a poor estimate of a spot's 

background intensity. 

Two approaches have been taken to the selection of spots involved with the 
background curve-fit. Since most microarray experiments contain thousands of spots, of 
which only a very small percent are affected by the experiment, it is probably best to 
curve-fit all spots in the microarray to equation 6. A refinement of this method is to use 
all spots that have no process control defects. Another alternative is to include ratio 
control spots within the array and use only those for curve fitting. The former two are 
preferred, because curve fitting either the entire array or at least much of it yields a strong 
statistical measurement of the background values. In the case where the experiment 
affects a large fraction of spots, however, it may be necessary to use ratio control spots. 

Constant k is interesting because it consists of a linear combination of all three 
desired values. While it is not possible to determine unique values of r b and g b from the 
curve-fit, there are two types of constraints that can be used to select useful values. First, 
the background values must be greater than the bias level of the detection system and less 
than the minimum values of the measured data. That is, 

Constraint 1: 

r hiax ^r h <r m ^ 

Equation 8 

Sbias — &h ~ & m^jj 

Equation 9 

The second type of constraint is based on the gene expression model. For genes 
that are unaffected by the experiment and are near zero expression, both the experiment 
and the control expression level should reach zero simultaneously. In mathematical 
terms, when r-*0, then g-+0. A linear regression of (r m -r b ) versus (g m -gb) should then 
yield a zero intercept. That is, selection of appropriate r b , g b should yield linear 
regression of 

(r m -r b ) = m(g m -g b ) + b 
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Equation 10 

such that b is approximately zero. This occurs when 

b = mg h -r h 

Equation 11 

The pair of values r h and g b that create a zero intercept of the linear regression is 
thus the second constraint that can be used for extracting the background subtraction 
constants. Solving equations 7 and 1 1 for g b gives 

Constraint 2: 

k + b 

Sh = 

m — R 

Equation 12 

where R and k come from the best-fit of 6 ? and m and b come from linear 
regression of 1 0. The background level r b can then be calculated by inserting g b into 
equation 7 or 1 1 . 

Although it is almost always possible to generate a curve-fit of the microarray 
spot intensities, it is not always possible to satisfy constraints 1) and 2), especially at the 
same time. Failure to satisfy constraint 1) is an indication that the experiment does not fit 
the expected ratio model or that one of the linearity assumptions is untrue. 

A somewhat trivial explanation of a failure to satisfy the constraints is that the 
spot intensities have been incorrectly determined. A common way that this happens is 
that the spot locations are incorrectly determined during the course of analysis. 

Under ideal circumstances, one would also expect that the linear regression slope, 
m 9 should equal the best-fit ratio R. This can also be used as a measure of success. At 
the same time, the linear regression intercept b should equal zero when the r b and g b meet 
constraint 1). 

Ratio Distribution Statistics 

The measured values of the numerator and denominator are random variables 
with mean and variance. That is, 
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r m -n =r ±r< 



SD 



g m -g h z= g ± Sso 

Equation 13 

where r and g are mean values and r S D and g S p are standard deviations of r and g. 
The ratio R of r and g is then a random variable too with an expected value R E and 
variance Rsd> That is, 



R = R IS ±R SD = 



r±rso 



g±g S D 

Equation 14 

Assuming that the measurement of numerator and denominator are normally 
distributed variables, an estimate of Re and R SD can be formed from Taylor series 
expansion. 



L. 2 r 



Equation 15 



Equation 16 

where cr rg is the covariance of the numerator and denominator summed over all 
image pixels in the spot. 



Equation 17 



Coefficient of Variation 

Assessing the quality of microarray scans and individual spots within an array is 
an important part of scanning and analyzing arrays. A useful metric for this purpose is 
the coefficient of variation (CV) of the ratio distribution, which is simply 
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R 

Equation 18 

In effect, the CV represents the experimental resolution of the gene expression 
ratio. Minimizing the CV should be the goal of scanning and analyzing gene expression 
ratio experiments. Minimization of Rsd is the best way to improve gene expression 
resolution. The graph 200 in Figure 2 gives two examples of comparisons between 
scanned images. The ratios of a first spot are shown in line 205, while the ratios of a 
second spot are shown as line 210. The first spot has a CV of 0.21, while the second spot 
has a CV of 0.60. The distribution of the ratios in the first spot 205 have a narrower 
distribution than the rations of the second spot 210. Thus, the first spot is a better spot to 
analyze. 

Equation 1 6 shows that the variability of the ratio decreases dramatically as a 
function of g, which is a well understood phenomenon. Dividing by a noisy 
measurement that is near zero produces a very noisy result. 

The ratio variance has an interesting dependence on the covariance a rg . Large 
values of a rg reduce the variability of the ratio. This dependence on the covariance is not 
widely known. In the case of microarray images, strong covariance of the numerator and 
denominator is a result of three properties of the image data: good alignment of the 
numerator and denominator images, genuine patterns and textures in the spot images, and 
a good signal-to-noise ratio (r/rso and glgs6)- 

Table 1 summarizes how variables combine to reduce Rsd- 
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order to reduce the ratio distribution variance. 
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Spot CV 

The CVis a fundamental metric and represents the spread of the ratio distribution relative 
to the magnitude of the ratio. Figure 3 shows that even though the second spot has a 
higher ratio than the first spot, the CV is 3X higher. The uncertainty of the second spot's 
ratio is far greater than the first spot's ratio. Thus, in addition to being useful for 
separating spots from the control population, the CV can also serve as an independent 
measure of a spot's quality. 

Average CV 

The average CV of the entire array of spots gives an excellent metric of the entire 
array quality. Scans from array WoRx alpha systems have been shown to have 
approximately l A the average CV of a corresponding laser scan. 

Normalized Covariance 

Covariance is known to be an indicator of the registration among channels, as 
well as the noise. Large covariance is normally a good sign. Low covariance, however, 
doesn't always mean the data are bad; it may mean that the spot is smooth and has only a 
small amount of intensity variance. Likewise, high variance is not necessarily bad if the 
variance is caused by a genuine intensity pattern within the spot. Figure 3 demonstrates 
that standard deviation increases with increasing spot intensity; the dependence is 
approximately linear. Figure 3 also shows that the observed standard deviation is not 
simply caused by the statistical noise associated with counting discrete events (statistical 
noise). Spots that have a substantial intensity pattern caused by non-uniform distribution 
of fluorescence will have a large variance and a large covariance (if the detection system 
is well aligned and has low noise). 

Thus, to make the covariance and the variance values useful they must be 
normalized somehow. In general, this can be accomplished by dividing the covariance 
by some measure of the spot's intensity variance. To determine the spot's variance, one 
could select one of the channels as the reference (for example the control channel, which 
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is green), or one could use a combination of the variance from all channels. The 
following table gives examples of the normalized co variance calculation: 



Normalization Method 

anr^f*^ PinnPfl in 

V Ctl 1 tljtiJ ClV-tVJ wvl kiX 


Normalized co variance 
calculations 

rr Equation 19 


quadrature 








Variances added 






Equation 20 






(cr r +cr fi ) 
2 




Control channel 
variance only 




Equation 2 1 


Experiment channel 
variance only 






Equation 22 



where, & rg is the normalized covariance, and a n and a g are the ^'ariances of 
channels 1 and 2, respectively. 

Figure 4 illustrates a plot 400 of all the spot's covariance values versus their 
average variance (as in equation 20). The plot 400 of Figure 4 reveals that the 
normalized covariance is a very consistent value. The slope of the points in Figure 4 
gives the typical value of the normalized covariance. (The average of the normalized 
covariance would give a similar result.) Outliers on the graph are almost always below 
the cluster of points along the line. Such outlying points occur when the intensity 
variance of the spot is unusually high, relative to the covariance. A study of these points 
shows that they have some sort of defect, which is often a bright speck of contaminating 
fluorescence. 

Covariance/Variance Correlation of the Entire Array 

Systematically poor correlation between covariance and variance can also point to 
the scanner's inability to measure covariance due to poor resolution, noise, and/or 
channel misalignment. Linear regression of the points in Figure 4 gives an indication of 
the scanner's ability to measure covariance. A broad scatter plot obviously indicates poor 
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correlation: the variance of the spot intensities is inconsistent between the channels. A 
low slope indicates that the scanner has relatively high variance, relative to it's ability to 
measure covariance. Thus, one could compare scanners by comparing the slope and 
correlation coefficient of a linear regression of Figure 4 (when the same slide is scanned). 
A good scanner has a tight distribution with large slope and outliers indicate array 
fabrication quality problems rather than measurement difficulties. The average and 
standard deviation of the normalized covariance give similar results and could be used 
instead of the slope and correlation coefficient, respectively. 
Spot Intensity Close to Local Background 

Spots that are close, or equal, to local background may be indistinguishable from 
background. A statistical method is employed to determine whether pixels within the 
spot are statistically different than the background population. 
Spot Intensity Below Local Background 

Spot intensities below the local background are a good example of how the local 
backgrounds are not additive. Such spots are not necessarily bad, but are certainly more 
difficult to quantify. This is a case where proper background determination methods are 
essential. The method described above can make use of such spots, provided that there is 
indeed signal above the true calculated background. 
Ratio Inconsistency (Alignment Problem or "Dye Separation") 

This metric compares the standard method of measuring the intensity ratio with an 
alternative method. The standard method uses the ratio of the average intensities, as 
described above. The alternative measure of ratio is the average and standard deviation 
of the pixel-by-pixel ratio of the spot. For reasonable quality spots, these ratios and their 
respective standard deviations are similar. 

There are two main source causes of inconsistency. Either the slide preparation 
contains artifacts that affect the ratio, or the measurement system is unable to adequately 
measure the spot's intensity. The following table lists more details about each source of 
inconsistency. 



Slide Preparation 


Measurement Problems 


Probe separation 


Noise 
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Target separation 


Misregistration 


Contamination with fluorescent 
material 


Non-linear response between channels. 



covariance. In the case of slide preparation problems, the ratio inconsistency points to 
chemistry problems, whereas measurement problems point to scanner inadequacy. 

Numerous variations and modifications of the invention will become 
readily apparent to those skilled in the art. Accordingly, the invention may be embodied 
in other specific forms without departing from its spirit or essential characteristics. 
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WHAT IS CLAIMED IS: 

1 . A method of determining a background intensity an image comprising: 
selecting a plurality of spots within the image falling within a least squares curve 

fit; and 

determining a constant background intensity for the spots within the curve fit. 

2. The method of Claim 1, further comprising determining a ratio an 
experimental image to a control image. 

3 . The method of Claim 2, further comprising determining the least squares 
curve fit from the equation: 

r m = R(g m -g b ) + r t- R S m + k 
where r m and g m are the measured values of the images, r b , and g b are the 
background intensities of the images, and k is a constant. 

4. The method of Claim 3, further comprising applying a constraint so the 
background intensities are greater than the bias levels. 

5. The method of Claim 3, further comprising applying a constraint so the 
background intensities create a zero intercept of a linear regression of the equation: 

(r m -r h ) = m(g m -g h ) + b 

Equation 13 

such that b is approximately zero, which occurs when 

b = mg b -r b . 

6. The method of Claim 5, further comprising extracting the background 
subtraction constants . 

7. A method of selecting a microarray scan for analysis comprising: 
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determining a coefficient of variation for the microarray scan; 

comparing the coefficient of variation to a predetermined threshold; and 

selecting a microarray scan if the coefficient of variation is lower than the 
predetermined threshold. 

8. The method of Claim 7, further comprising determining the coefficient of 
variation from the equation: 



CV = - 



R 



-? 2 

r- r sn ^ r 



where R SD « p + -±r - 2cr,« -p- . 

9. The method of Claim 7, further comprising determining a spot coefficient 
of variation. 

10. The method of Claim 7, further comprising determining an average 
coefficient of variation. 

11. A method of extracting data from an image comprising: 
determining a co variance and a variance the of the image; 
normalizing the covariance; 

determining the average and standard deviation of the covariance; and 
selecting the data based on the average and standard deviation of the covariance. 

12. The method of Claim 1 1, further comprising calculating the covariance 
according to the following equation: 
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13. The method of Claim 11, further comprising normalizing the covariance 
by adding the variances in quadrature according to the following equation: 

oL--?*— 



where a rg is the normalized covariance, and a>, and a g are the variances of the 
control and experimental channels. 

14. The method of Claim 1 1 , further comprising normalizing the covariance 
by adding the variances according to the following equation: 

tr' = i — ^ 



where ef rg is the normalized covariance, and a r , and cr g are the variances of the 
control and experimental channels. 

1 5 . The method of Claim 1 1 , further comprising normalizing the covariance 
by using a control channel variance according to the following equation: 

where cf rs is the normalized covariance, and o>, and a g are the variances of the 
control and experimental channels. 

16. The method of Claim 1 1, further comprising normalizing the covariance 
by using an experimental channel variance according to the following equation 
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where cr' rg is the normalized covariance, and o> ; and er g are the variances of the 
control and experimental channels. 

1 7. A method of extracting data from an image comprising: 
determining a covariance and a variance the of the image; 
determining the slope of the covariance plotted against the variance; and 
selecting the data where the slope exceeds a predetermined threshold. 

18. The method of Claim 1 7, further comprising plotting each covariance 
value versus the average variance values. 

19. The method of Claim 1 7, further comprising ignoring data points not 
along the slope of the covariance plotted against the variance. 

20. The method of Claim 1 7, further comprising performing linear regression 
of the covariance plotted against the variance to create a distribution of data points. 

21. The method of Claim 20, further comprising selecting an image having a 
tight distribution of data points. 
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